Summary table at
https://docs.google.com/spreadsheet/ccc?key=0AtV_gF766XZAdEVVOUpNMmJHN1RjUjVud3dtRjdwOWc#gid=0












Trinity

./Trinity.pl --seqType fq --JM 20000G --left /Volumes/web/whale/fish546/module3_assembly/filtered_106A_Female_Mix_GATCAG_L004_R1.fastq.gz --right /Volumes/web/whale/fish546/module3_assembly/filtered_106A_Female_Mix_GATCAG_L004_R2.fastq.gz --CPU 10 --output /Volumes/web/whale/fish546/module3_assembly 


./Trinity.pl --seqType fq --JM 20000G --left /fish546/module3_assembly/filtered_106A_Female_Mix_GATCAG_L004_R1.fastq.gz --right /fish546/module3_assembly/filtered_106A_Female_Mix_GATCAG_L004_R2.fastq.gz --CPU 10 --output /fish546/module3_assembly 

#fail




Writing output to the NAS created a 60 GB temp file, then crashed the computer.

Would it work if run locally…???

No success




VELVET

./velveth /Volumes/web/whale/ 21 -fastq -short /Volumes/web/cnidarian/Olurida_trans_v2_trimmed%20libraries.fastq


robertsmac:velvet_1.1.07 sr320$ ./velveth /Volumes/web/whale/fish546/module3_assembly/ 21 -fastq -short /Volumes/web/whale/fish546/module3_assembly/SE_sm_filtered_LTL_PS63_AGTTCC_L005_R1.fastq
[0.000001] Reading FastQ file /Volumes/web/whale/fish546/module3_assembly/SE_sm_filtered_LTL_PS63_AGTTCC_L005_R1.fastq

No luck
……..



SOAP

genefish:soap Steven$ ./SOAPdenovo31mer all -s config -o fish546mod3

Version 1.05: released on July 29th, 2010

pregraph -s config -o fish546mod3
In config, 1 libs, max seq len 75, max name len 256

8 thread created
read from file:
 filtered_106A_Female_Mix_GATCAG_L004_R1.fastq.gz
read from file:
 filtered_106A_Female_Mix_GATCAG_L004_R2.fastq.gz
time spent on hash reads: 367s, 79646478 reads processed
[LIB] 0, avg_ins 200, reverse 0
368302795 nodes allocated, 4221263334 kmer in reads, 4221263334 kmer processed
338341102 linear nodes
time spent on marking linear nodes 3s
time spent on pre-graph construction: 370s

deLowKmer 0, deLowEdge 1
Start to remove tips of single frequency kmers short than 46
9357683 tips off
8 thread created
5256794 linear nodes
Start to remove tips which don't contribute the most links
kmer set 0 done
kmer set 1 done
kmer set 2 done
kmer set 3 done
kmer set 4 done
kmer set 5 done
kmer set 6 done
kmer set 7 done
820766 tips off
8 thread created
0 linear nodes
time spent on cutTipe: 143s


Completed in ~1 hour

Complete files (input, config, file, output) available @
http://eagle.fish.washington.edu/whale/index.php?dir=fish546%2Fmodule3_assembly%2FSOAP_mod3_PE%2F


Filtering contig output

Galaxy17-[Filter_sequences_by_length_on_data_16].fasta <-- 33,096 contigs

and @
http://eagle.fish.washington.edu/cnidarian/fish546/fish546_Module_3_SOAPgf_contigs.fa
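
The length filter was run in Galaxy. For reference, a minimal Python sketch of the same kind of filter is below. It is not Galaxy's implementation. The function names are made up for illustration, and the 400 bp default is taken from the minimum length in the summary statistics in this entry.

```python
from typing import Iterable, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_len: int = 400
) -> Iterator[Tuple[str, str]]:
    """Keep only records whose sequence is at least min_len bases long."""
    for header, seq in records:
        if len(seq) >= min_len:
            yield header, seq


# Hypothetical usage (file name is a placeholder, not a file from this entry):
# with open("contigs.fa") as fh:
#     kept = list(filter_by_length(read_fasta(fh), min_len=400))
#     print(len(kept), "contigs at 400 bp minimum")
```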


Ran compute sequence length in Galaxy


Histogram of sequence length


XY Length and Coverage




Summary Statistics on Length (galaxy)
#sum mean stdev 0% 25% 50% 75% 100%
2.2802e+07 688.965 360.703 400 464 566 776 5823
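
A sketch of how the same columns (sum, mean, stdev, quartiles) could be computed from a list of contig lengths outside Galaxy. This assumes a sample standard deviation and linearly interpolated quartiles; whether Galaxy's Summary Statistics tool uses exactly these conventions has not been checked here. `length_summary` is a name made up for illustration.

```python
import statistics
from typing import Dict, Sequence


def length_summary(lengths: Sequence[int]) -> Dict[str, float]:
    """Return sum, mean, sample stdev, and 0/25/50/75/100% quantiles."""
    # method="inclusive" interpolates between data points and treats the
    # smallest and largest values as the 0% and 100% quantiles.
    q25, q50, q75 = statistics.quantiles(lengths, n=4, method="inclusive")
    return {
        "sum": sum(lengths),
        "mean": statistics.mean(lengths),
        "stdev": statistics.stdev(lengths),
        "0%": min(lengths),
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "100%": max(lengths),
    }
```

Requires Python 3.8 or later for `statistics.quantiles`.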



Fusion Table

Length (log scale) vs. coverage







SOAP on iPlant
CONFIG
max_rd_len=75
[LIB]
avg_ins=200
reverse_seq=0
asm_flags=3
rank=
q1=filtered_106A_Female_Mix_GATCAG_L004_R1.fastq.gz
q2=filtered_106A_Female_Mix_GATCAG_L004_R2.fastq.gz
[LIB]
avg_ins=
reverse_seq=0
asm_flags=3
rank=
q1=
q2=
[LIB]
avg_ins=--reverse_seqs3=0
reverse_seq=
asm_flags=3
rank=
q1=
q2=
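
Only the first [LIB] block in the config above has values filled in; the other two look like empty placeholders from the iPlant form (that reading is an assumption). A sketch of writing a single-library config with the same filled-in values, skipping any key left blank. `soap_config` is a made-up helper, and whether SOAPdenovo accepts a config with `rank` omitted has not been verified here.

```python
from typing import Dict, List, Optional, Union

Value = Optional[Union[int, str]]


def soap_config(max_rd_len: int, libraries: List[Dict[str, Value]]) -> str:
    """Build SOAPdenovo-style config text, skipping keys with no value."""
    lines = [f"max_rd_len={max_rd_len}"]
    for lib in libraries:
        lines.append("[LIB]")
        for key, value in lib.items():
            # Blank entries (None or empty string) are left out entirely.
            if value is not None and value != "":
                lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Values copied from the first [LIB] block above; rank was blank there.
library = {
    "avg_ins": 200,
    "reverse_seq": 0,
    "asm_flags": 3,
    "rank": None,
    "q1": "filtered_106A_Female_Mix_GATCAG_L004_R1.fastq.gz",
    "q2": "filtered_106A_Female_Mix_GATCAG_L004_R2.fastq.gz",
}
config_text = soap_config(75, [library])
```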

ALL FILES @
http://eagle.fish.washington.edu/whale/index.php?dir=fish546%2Fmodule3_assembly%2Fiplant%2FSOAP_fish545_PEassembly-2013-01-24-11-47-11.901%2F


iplant_soap.mov


Contigs:
http://eagle.fish.washington.edu/whale/fish546/module3_assembly/iplant/SOAP_fish545_PEassembly-2013-01-24-11-47-11.901/SoapOutput.contig

37,082 contigs at 400 bp minimum

http://eagle.fish.washington.edu/cnidarian/fish546/fish546_module_3_SOAPiplant_contigs.fa






#sum mean stdev 0% 25% 50% 75% 100%
3.15707e+07 851.376 603.753 400 491 644 977 10274








Scaffolds












ABySS on iPlant

ALL FILES @
http://eagle.fish.washington.edu/whale/index.php?dir=fish546%2Fmodule3_assembly%2Fiplant%2FAbyss_fish546_PE-2013-01-24-11-50-07.639%2F


Assuming the unitig file is the final output:
http://eagle.fish.washington.edu/whale/fish546/module3_assembly/iplant/Abyss_fish546_PE-2013-01-24-11-50-07.639/k25/assembly-unitigs.fa

25,028 sequences at 400 bp minimum




#sum mean stdev 0% 25% 50% 75% 100%
2.75774e+07 1101.86 884.089 400 546 797 1332 18616



http://eagle.fish.washington.edu/cnidarian/fish546/fish546_module_3_Abyss_iplant_contigs.fa



BLASTING SEQUENCES
d-128-95-149-219:bin sr320$ ./blastx -query /Volumes/web/whale/fish546/blast/query/fish546_module_3_Abyss_iplant_contigs.fa -db /Volumes/web/whale/fish546/blast/db/nr -out /Volumes/web/whale/fish546/blast/out/fish546_PEAbyss_NR.xml -outfmt 5 -max_target_seqs 1 -num_threads 8

./blastx -query /Volumes/web/whale/fish546/blast/query/fish546_module_3_Abyss_iplant_contigs.fa -db /Volumes/web/whale/fish546/blast/db/swissprot -out /Volumes/web/whale/fish546/blast/out/fish546_PEAbyss_swissprot.xml -outfmt 5 -max_target_seqs 1 -num_threads 8
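
Both commands write BLAST XML (-outfmt 5) with at most one target per query. A sketch of pulling the top hit per contig out of such a file with the Python standard library. `top_hits` is a made-up name, and the element names are the ones used by the standard BLAST XML output format.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Optional, Tuple


def top_hits(source) -> Iterator[Tuple[str, Optional[str], Optional[float]]]:
    """Yield (query, first hit description, first HSP e-value) per query.

    source is a path or a binary file object. Queries with no hits
    yield (query, None, None).
    """
    # iterparse streams the file, so a large nr result need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        query = elem.findtext("Iteration_query-def")
        hit = elem.find("Iteration_hits/Hit")
        if hit is None:
            yield query, None, None
        else:
            evalue = hit.findtext("Hit_hsps/Hsp/Hsp_evalue")
            yield query, hit.findtext("Hit_def"), (
                float(evalue) if evalue is not None else None
            )
        elem.clear()  # free the parsed subtree before moving on


# Hypothetical usage (path is a placeholder):
# n_with_hit = sum(1 for _q, hit, _e in top_hits("out.xml") if hit is not None)
```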







Velvet on iPlant
#fail


CLC v4


http://eagle.fish.washington.edu/cnidarian/fish546/fish546_Module_3_CLCv4_contigs.fa


41,319 sequences at 400 bp minimum



#sum mean stdev 0% 25% 50% 75% 100%
4.63808e+07 1122.51 984.788 400 526 768 1336 19213



Trying ABySS on iPlant again, changing k-mer to 40…


#fail


CLC version 6



33,877 contigs at 400 bp minimum




http://eagle.fish.washington.edu/cnidarian/fish546/fish546_Module_3_CLCv6_contigs.fa







#sum mean stdev 0% 25% 50% 75% 100%
4.50732e+07 1330.5 1272.28 401 545 858 1627 18596
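
A quick sanity check on the numbers in this entry: since mean = sum / count, dividing each reported sum by its mean should give back the reported sequence count. The sketch below uses only the rounded values recorded above; the assembly labels are shorthand chosen for this check.

```python
# (label, reported sum, reported mean, reported sequence count)
assemblies = [
    ("SOAP local", 2.2802e07, 688.965, 33096),
    ("SOAP iPlant", 3.15707e07, 851.376, 37082),
    ("ABySS iPlant", 2.75774e07, 1101.86, 25028),
    ("CLC v4", 4.63808e07, 1122.51, 41319),
    ("CLC v6", 4.50732e07, 1330.5, 33877),
]


def implied_count(total_bp: float, mean_len: float) -> int:
    """Sequence count implied by total length and mean length."""
    return round(total_bp / mean_len)


for label, total_bp, mean_len, reported in assemblies:
    implied = implied_count(total_bp, mean_len)
    status = "ok" if implied == reported else "MISMATCH"
    print(f"{label:13s} implied={implied:6d} reported={reported:6d} {status}")
```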